
feat(memory): Hermes V3 long-term memory — W1 + W2 (sqlite_vec + import + read + memdebug)#2

Merged
liyoungc merged 11 commits into main from
feat/memory-sqlite-vec-w1
May 2, 2026
Conversation


liyoungc (Owner) commented May 2, 2026

Implements W1 + W2 + W3 + W4 (prepped) of the Hermes V3 long-term memory design.

W1 — schema bootstrap (done)

plugins/memory/sqlite_vec/ registers as a MemoryProvider plugin: episodes (hot tier) + semantic_facts (cold tier) + vec_facts (vec0 virtual table) + 3 sync triggers.

W2-1 — read path + embedding wrapper (done)

  • embed.py: async voyage_embed() (httpx, 128 batch, 3× backoff retry, locked dim/dtype 512/int8).
  • read.py: Fact dataclass + async read_memory() (vec0 prefilter k=50, SQL CTE rerank 0.7*sim + 0.3*exp(-age/90)), p95 logged. bump_hits() fire-and-forget. format_facts_for_prompt() with with_meta flag.
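The rerank formula above reduces to a few lines. This is a minimal sketch with the weights and recency decay taken from the description (0.7*sim + 0.3*exp(-age/90)); the function and variable names are illustrative, not the actual read.py API:

```python
import math

# Illustrative rerank: blend vector similarity with a 90-day
# exponential recency decay, per the W2-1 description.
def rerank_score(sim: float, age_days: float) -> float:
    return 0.7 * sim + 0.3 * math.exp(-age_days / 90)

# A fresher fact can outrank a slightly more similar but stale one.
facts = [("birthday", 0.60, 5), ("old-note", 0.62, 400)]
ranked = sorted(facts, key=lambda f: rerank_score(f[1], f[2]), reverse=True)
```

In the plugin this blend runs inside a SQL CTE over the vec0 prefilter's top-50, so only candidate rows pay the scoring cost.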

W2-2 — MEMORY.md import (done)

  • scripts/import_md.py: parses Topic: content §, slugifies hierarchy, preserves CJK, idempotent, atomic, --dry-run / --commit.
  • 25 entries imported live on chococlaw; cosine retrieval verified.

W2-3 — wire prefetch + sync_turn (done, live on chococlaw)

  • Provider prefetch() runs in worker thread (5s timeout) returning the recall block.
  • check_same_thread=False + per-provider lock for cross-thread sqlite3.
  • Activation: config.yaml memory.provider: sqlite_vec (no env-var gate).

W2-4 — /memdebug slash command (done, live)

  • plugins/memdebug/ standalone plugin, registers /memdebug <q> and /memdebug rawsearch <q>. Logs invocations to memory.log.

W3-1 — kimi_extract + EXTRACT_PROMPT (done)

  • plugins/memory/sqlite_vec/extract.py: PROMPT verbatim from spec §5.2, PHI_BLACKLIST_CHANNELS short-circuit, tolerant JSON parser (handles 3 different Kimi output shapes observed in live testing).

W3-2 — write_episode + sync_turn write-back (done, live)

  • plugins/memory/sqlite_vec/write.py: per-turn write back, fast-track threshold 30d, JSONL failure log.
  • Wired into sync_turn after bump_hits via worker thread (30s timeout). msg_id synthesized via hash for idempotency.

W3-3 — weekly_promotion + weekly_apply (done, live, cron-scheduled)

  • plugins/memory/sqlite_vec/promotion.py: PROMOTION_PROMPT designed; weekly_promotion + weekly_apply async; render_digest_markdown matching spec §5.4; discord_post helper with chunking.
  • scripts/cron/weekly_promotion.py + weekly_apply.py thin wrappers (deployed to ~/.hermes/scripts/).
  • Cron entries added: 0 19 * * 6 (Sun 03:00 UTC+8) + 0 19 * * 0 (Mon 03:00 UTC+8).
  • Auto-fallback observed: Kimi-K2-Thinking 404s on synthetic.new → falls back to K2.5.

W3-4 — /memreview reject + /mem kill switch (done, live)

  • plugins/memreview/: /memreview reject <digest_id> writes sentinel; /mem off|on|status toggles MEM_OFF global kill switch.
  • MEM_OFF check wired into both sync_turn (skip write_episode) and weekly_promotion (skip Kimi call). Read path unaffected.

W4-1 — cutover prep (prepped, awaiting soak)

  • scripts/cutover/cutover.sh: idempotent bash script, dry-run by default. Archives MEMORY.md, disables legacy crons, smoke tests, restarts gateway.
  • Not executed — acceptance criteria require 1 full day soak + 1 weekly review cycle observed. User runs when ready (target 2026-05-24).

W4-2 — runbooks (done, in hermes-memory repo)

W1 schema fixes bundled across W2-W3

  1. vec_facts was FLOAT[512] → changed to int8[512] (W2-1). vec0 INSERT requires vec_int8(blob) wrapper; UPDATE rejected on int8 even with wrapper, so trigger rewritten as DELETE+INSERT.
  2. vec0 default L2 distance breaks the rerank formula on int8; added distance_metric=cosine (W2-2).
  3. LOG_PATH = Path.home() resolves to /home/hermes inside container (not the /opt/data mount); switched to hermes_constants.get_hermes_home() (W2-4).
  4. SQLite connection thread-safety: check_same_thread=False + per-provider lock (W2-3).
  5. _apply_diff_atomic was holding BEGIN open across Voyage HTTP — embed BEFORE BEGIN (W3-3).
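The DELETE+INSERT trigger rewrite in fix (1) can be illustrated with plain SQLite. vec0 itself is not loadable here, so `facts`/`facts_mirror` below are hypothetical stand-ins for semantic_facts/vec_facts; the pattern is the point:

```python
import sqlite3

# Stand-in schema: vec0 rejects UPDATE on int8 embedding columns,
# so the sync trigger deletes and re-inserts the mirror row instead.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE facts (id INTEGER PRIMARY KEY, embedding BLOB);
CREATE TABLE facts_mirror (id INTEGER PRIMARY KEY, embedding BLOB);
CREATE TRIGGER facts_after_update_embedding
AFTER UPDATE OF embedding ON facts BEGIN
  DELETE FROM facts_mirror WHERE id = old.id;
  INSERT INTO facts_mirror (id, embedding) VALUES (new.id, new.embedding);
END;
""")
db.execute("INSERT INTO facts VALUES (1, x'01')")
db.execute("INSERT INTO facts_mirror VALUES (1, x'01')")
db.execute("UPDATE facts SET embedding = x'02' WHERE id = 1")
row = db.execute("SELECT embedding FROM facts_mirror WHERE id = 1").fetchone()
```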

End-to-end live verification (chococlaw)

| Path | Result |
| --- | --- |
| W2-1 retrieval × 3 queries | top-1 semantically correct, sim ∈ [0.43, 0.61] |
| W2-2 import 25 facts | one Voyage batch, idempotent re-run |
| W2-3 in-process smoke | MemoryManager.prefetch_all returns full markdown block |
| W2-4 /memdebug × 3 | help / semantic / rawsearch all working |
| W3-1 kimi_extract × 4 | pleasantry/long-lived/PHI ✓; short-lived ⚠ (spec prompt issue, see issue NousResearch#8) |
| W3-2 write_episode end-to-end | 2 episodes + 1 fast-tracked fact, Kimi correctly inferred valid_to=2026-05-11 |
| W3-3 promotion + apply | 4 fixture episodes → Kimi diff (2 promote, 1 noise) → applied, semantic_facts 25→27→25 cleanup |
| W3-3 Discord post | digest rendered correctly to #memory-review (channel 1483958144596967464) |
| W3-4 reject + apply | sentinel written → archived as .rejected.json, semantic_facts unchanged |

Tests

522/522 green across the W1-W3 surface in container:

docker exec -w /opt/hermes hermes /opt/hermes/.venv/bin/python3 -m pytest \
  tests/plugins/ tests/scripts/test_import_md.py -q
522 passed, 2 warnings in 8.47s

Including: W1 schema (7), W2-1 read path (10), W2-2 import_md (12), W2-3 prefetch wiring (6), W2-4 memdebug (10), W3-1 extract (22), W3-2 write_episode (11), W3-3 promotion (17), W3-4 memreview (15) = 110 new tests plus existing sibling-plugin coverage that we did not regress.

Spec references

Notes for review

  • Personal fork on chococlaw; not for upstream NousResearch.
  • Voyage remains on the free tier (3 RPM) until a payment method is added; the 200M-token allowance is unchanged.
  • Kimi-K2-Thinking unavailable on synthetic.new at the time of writing — auto-fallback to K2.5 produces acceptable promotion diffs.
  • open_db / init_db gained a keyword-only check_same_thread param (default True; only the provider passes False) — backwards-compatible.

liyoungc and others added 5 commits May 2, 2026 08:07
Introduces a new MemoryProvider plugin implementing Hermes V3 long-term
memory design (two-tier: hot episodes + cold curated semantic_facts,
weekly human-approved promotion).

W1 scope is schema only — no read or write path yet:
  - plugins/memory/sqlite_vec/{__init__,store,plugin.yaml,schema.sql}
  - episodes table (hot raw turn record, channel-scoped idempotent)
  - semantic_facts table (cold curated, with valid_from/valid_to validity
    windows borrowed from the MemPalace temporal-triple pattern)
  - vec_facts vec0 virtual table (512-dim float32) + 3 sync triggers
  - SqliteVecMemoryProvider class registers with MemoryProvider ABC
    but prefetch/sync_turn are no-ops until W2/W3 wire them.

Tests (7/7 passing inside running hermes container):
  - bootstrap creates all expected tables/indexes/triggers
  - bootstrap is idempotent
  - semantic_facts column defaults populate (created_at, valid_from)
  - role CHECK constraint rejects values other than user/assistant
  - triggers keep vec_facts in sync on insert/update/delete
  - vec0 MATCH+k returns nearest neighbour
  - provider lifecycle round-trips

Activates via $HERMES_HOME/config.yaml memory.provider: sqlite_vec
(deferred; W4 cutover only).

Refs liyoungc/hermes-memory#2 (W1-1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements the read path for the sqlite_vec memory plugin per
docs/superpowers/specs/2026-05-02-hermes-memory-design.md §4.

embed.py: async voyage_embed() with httpx.AsyncClient, 128-text batching,
3x exponential-backoff retry on 5xx, fail-loud on missing VOYAGE_API_KEY
or 4xx. dim/dtype locked to spec values (512/int8) so config drift
fails fast.

read.py: Fact dataclass + async read_memory() using vec0 prefilter (k=50)
and the SQL CTE rerank with locked weights 0.7*sim + 0.3*recency
(90-day half-life). bump_hits() is fire-and-forget UPDATE that swallows
sqlite errors with a warning. p95 latency logged as JSON line to
~/.hermes/logs/memory.log.
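The batching and retry shape described above can be sketched with the HTTP call injected, so the sketch stays stdlib-only (the real embed.py uses httpx; `embed_all` and `flaky_post` are illustrative names, not the actual API):

```python
import time

BATCH = 128  # Voyage batch size per the commit message

def embed_all(texts, post, retries=3, base_delay=0.0):
    """Batch texts 128 at a time; retry each batch with exponential
    backoff on a (simulated) 5xx, failing loud after the last attempt."""
    out = []
    for i in range(0, len(texts), BATCH):
        batch = texts[i:i + BATCH]
        for attempt in range(retries):
            try:
                out.extend(post(batch))
                break
            except RuntimeError:                  # stand-in for a 5xx
                if attempt == retries - 1:
                    raise                         # fail loud
                time.sleep(base_delay * 2 ** attempt)
    return out

calls = {"n": 0}
def flaky_post(batch):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("503")                 # first batch fails once
    return [[0.0] * 512 for _ in batch]           # fake 512-dim vectors

vecs = embed_all([f"t{i}" for i in range(130)], flaky_post)
```

130 texts yield two batches (128 + 2); the first batch retries once, so the fake endpoint sees three calls in total.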

W1 schema fix: vec_facts changed from FLOAT[512] to int8[512] to match
spec §1.4 (Voyage 3.5-lite, 512-dim, int8). vec0 int8 columns require
the vec_int8() SQL wrapper on INSERT, and reject UPDATE entirely even
with the wrapper, so sf_after_update_embedding now does DELETE+INSERT.

Tests: 10 new cases (mock httpx for voyage success/batching/5xx-retry/
4xx/missing-key/empty-input; read_memory orders by score and filters
expired; bump_hits increments and swallows errors; format_facts shape).
17/17 green.

Refs liyoungc/hermes-memory#4
scripts/import_md.py seeds semantic_facts from ~/.hermes/memories/MEMORY.md
per spec §6.1. Each "Topic: content §"-delimited entry maps to one
semantic_fact row with entity prefix "禮揚." plus a slug of the topic,
importance=2, valid_from=2026-05-10, valid_to=NULL. Hierarchical topics
like "Tools & Access > ProtonMail" become entity
"禮揚.tools_access.protonmail" so prefix queries still work.
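A minimal sketch of the slug rule as described (hierarchy separators become dots, ASCII is lowercased with punctuation runs collapsed to `_`, CJK passes through because Python's `\w` is Unicode-aware); the actual import_md.py implementation may differ in detail:

```python
import re

def slugify(topic: str) -> str:
    """Illustrative topic-to-entity slug per the description above."""
    parts = [p.strip() for p in topic.split(">")]
    slugs = [re.sub(r"\W+", "_", p.lower()).strip("_") for p in parts]
    return "禮揚." + ".".join(slugs)
```

With this rule, "Tools & Access > ProtonMail" maps to "禮揚.tools_access.protonmail" and a pure-CJK topic like "生日" survives unchanged as "禮揚.生日".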

Embeds in batches of 128 via Voyage 3.5-lite. Idempotent: pre-INSERT
(entity, fact) lookup skips duplicates so re-runs are safe. Wraps the
batch-insert in BEGIN/COMMIT and rolls back on embed failure so partial
imports never land. Supports --dry-run for preview and --commit for the
real write.

W1 schema fix bundled: vec0 column now declares distance_metric=cosine.
Without this, the default L2 distance on int8 vectors produces sim
values in the hundreds, breaking the 0.7*sim + 0.3*recency rerank
formula entirely. Verified end-to-end on chococlaw:

  Q: "我太太生日" ("my wife's birthday") -> top hit "**生日**: 3/19"  sim=0.604  OK
  Q: "AI as digital twin"              -> top hit "Think of AI as a digital twin"  sim=0.607  OK

Tests: 12 new cases for import_md (slugify simple/hierarchy/CJK/empty;
parse colon-missing/no-trailing-§; dry-run no-write; commit populates
vec_facts via trigger; idempotent re-run; partial update embeds only
new; rollback on embed failure leaves DB unchanged). 29/29 green
including W1 + W2-1.

Live import: 25 entries, 1 Voyage batch, all visible in semantic_facts
and vec_facts on chococlaw:/opt/data/memories/memory.db.

Refs liyoungc/hermes-memory#5
SqliteVecMemoryProvider.prefetch() now embeds the user message via
Voyage 3.5-lite, runs read_memory() (vec0 prefilter k=50, SQL CTE
rerank with cosine sim + 90-day half-life), and returns a markdown
block:

    ## Recent relevant memories
    - [entity.slug] fact text (importance: N, age: D days)
    ...

Activation is via config.yaml (memory.provider: sqlite_vec) — no env
var gate. Per spec §4 the persona files (SOUL.md, USER.md,
life-dimensions.md) stay in flat-file injection above this block; the
gateway's existing prompt assembler handles ordering.

Hits accounting (spec §4): retrieved fact IDs are stashed per
session_id. sync_turn() runs bump_hits() on the cached IDs *after* the
reply is delivered, so the UPDATE never sits on the user-facing
latency path. Errors are swallowed.

Async-in-sync bridge: the ABC's prefetch/sync_turn are sync, but the
gateway already owns the asyncio loop, so asyncio.run inline raises.
Solution is a worker thread with its own event loop and a 5s timeout
kill-switch. To make sqlite3 cross-thread access legal, the connection
opens with check_same_thread=False and self._lock serializes both
read_memory and bump_hits. open_db()/init_db() now take a keyword-only
check_same_thread param (default True; provider passes False).
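The worker-thread bridge described above, reduced to a minimal sketch (`run_coro_in_worker` is a hypothetical name; the provider's real plumbing likely differs in detail):

```python
import asyncio
import threading

def run_coro_in_worker(coro, timeout=5.0):
    """Run a coroutine from sync code even when the calling process
    already owns a running event loop: spawn a worker thread with its
    own loop and join with a timeout kill-switch."""
    result, error = {}, {}

    def worker():
        try:
            result["value"] = asyncio.run(coro)   # fresh loop in this thread
        except Exception as e:                    # sketch: surface any failure
            error["value"] = e

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)
    if t.is_alive():
        raise TimeoutError("memory prefetch exceeded budget")
    if "value" in error:
        raise error["value"]
    return result["value"]

async def fake_prefetch():
    await asyncio.sleep(0.01)
    return "## Recent relevant memories"

block = run_coro_in_worker(fake_prefetch())
```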

format_facts_for_prompt() gained a with_meta=True flag that appends
"(importance: N, age: D days)" per fact, used by prefetch. /memdebug
will keep the compact (with_meta=False) form.

Tests: 6 new cases (markdown header, empty/trivial query no-op,
voyage error swallow, sync_turn bumps then clears cache, worker
timeout, with_meta format). 35/35 green including W1, W2-1, W2-2.

Live activation verified on chococlaw:

  config.yaml memory.provider: '' -> sqlite_vec
  docker compose restart gateway
  Memory provider 'sqlite_vec' registered (0 tools)
  sqlite_vec memory ready at /opt/data/memories/memory.db

End-to-end via MemoryManager.prefetch_all() against the real DB:
"我太太生日" ("my wife's birthday") returns the full 8-fact markdown block with top-1 = "**生日**: 3/19".

Refs liyoungc/hermes-memory#6
plugins/memdebug/ is a standalone plugin that registers the /memdebug
slash command via the hermes-agent ctx.register_command() surface.
Memory plugins live in plugins/memory/ and load through the exclusive
loader, which doesn't pass through the slash-command registry — keeping
/memdebug separate is the cleanest split.

Behaviour (spec §7.2):

  /memdebug                  -> short usage help
  /memdebug <query>          -> top-8 from semantic_facts with
                                score + sim + age + importance breakdown
  /memdebug rawsearch <query> -> substring scan of episodes (forensics)

Each invocation logs to ~/.hermes/logs/memory.log as a JSON line so the
F2 monitoring path (% top-1 hits judged useful) can aggregate weekly.

Reaction logging deferred: the issue acceptance criterion calls for
👍/👎 reaction prompts on the embed message, but Discord-native rich
embeds + reaction collectors require gateway-side plumbing
(gateway/platforms/discord.py) that the spec §8 marks as iterate-after-W2
work. v1 emits a textual "React 👍/👎 to flag this retrieval." cue and
relies on manual user reactions for now.

LOG_PATH bug fix bundled: both this plugin and plugins/memory/sqlite_vec/
were resolving the log path via Path.home(), which inside the hermes
container resolves to /home/hermes — not the /opt/data mount. Switched
to hermes_constants.get_hermes_home() so logs land in the mounted
~/.hermes/logs/memory.log on the host. Confirmed live:

  $ tail -2 ~/.hermes/logs/memory.log
  {"ts": "2026-05-02T13:06:17", "q": "今晚晚餐", "k": 8, "n": 8, "sql_ms": 2.81}
  {"ts": "2026-05-02T13:06:17", "cmd": "memdebug", "q": "今晚晚餐", "n": 8, "ids": [...]}

Also fixed a Python default-arg gotcha: _open_memory_db(path=DEFAULT_DB)
bound DEFAULT_DB at def-time so monkeypatching the module global didn't
take effect. Switched to lazy lookup (path = path or DEFAULT_DB).
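The def-time binding gotcha is easy to reproduce in isolation; the fixed form resolves the module global at call time:

```python
DEFAULT_DB = "/opt/data/memories/memory.db"

# Gotcha: the default is captured when the function is DEFINED,
# so monkeypatching the module global later has no effect.
def open_bound(path=DEFAULT_DB):
    return path

# Fix: lazy lookup resolves the global at CALL time.
def open_lazy(path=None):
    return path or DEFAULT_DB

saved = DEFAULT_DB
DEFAULT_DB = "/tmp/test.db"      # simulate a test monkeypatch
bound, lazy = open_bound(), open_lazy()
DEFAULT_DB = saved
```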

Tests: 10 new for memdebug (truncate, help/empty/rawsearch-no-arg,
semantic with score breakdown, db-missing friendly message,
rawsearch finds substring, rawsearch empty, sync entry-point dispatch,
register() wires the right name + handler shape). 45/45 green
including W1, W2-1, W2-2, W2-3.

Live verification on chococlaw:

  /memdebug              -> help text
  /memdebug 我太太生日   -> top-1 = "**生日**: 3/19" (sim=0.604)
  /memdebug rawsearch 致妤 -> "Episodes are written by W3" (placeholder)

Refs liyoungc/hermes-memory#7
liyoungc changed the title from "feat(memory): bootstrap sqlite_vec plugin schema (W1)" to "feat(memory): Hermes V3 long-term memory — W1 + W2 (sqlite_vec + import + read + memdebug)" on May 2, 2026
liyoungc added 6 commits May 2, 2026 13:22
plugins/memory/sqlite_vec/extract.py implements the per-turn
extraction stage of the write path.

EXTRACT_PROMPT is a verbatim copy of spec §5.2 (HARD RULES 1-4 +
JSON shape contract); paraphrasing here would compromise the F2
monitoring contract that downstream weekly review depends on.

PHI_BLACKLIST_CHANNELS = {"cmio", "cbme", "medicine"} short-circuits
to [] before any network call so hospital data never round-trips
through synthetic.new.

kimi_extract(user, assistant, channel, ts) calls Kimi K2.5 via
synthetic.new's OpenAI-compatible endpoint with temperature=0.1,
response_format=json_object, max_tokens=1024. Token usage is logged
to ~/.hermes/logs/memory.log so weekly review can spot a runaway
extract budget.

JSON parser is intentionally tolerant: in live testing Kimi K2.5
returned three different shapes for the same prompt at temperature=0.1:
  1. bare list           [{...}]
  2. wrapped object      {"analysis": "...", "extracted_memories": [...]}
  3. flat single fact    {"type":"episodic","text":"...","entity":...}
_parse_json_list() handles all three, falls back to the first
list-valued field, and detects single-fact dicts by canonical key
presence.
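A hedged sketch of that three-shape tolerance (the canonical-key set and the exact fallback order here are assumptions, not the real extract.py contract):

```python
import json

CANONICAL_KEYS = {"type", "text"}   # assumed single-fact marker keys

def parse_json_list(raw: str) -> list:
    """Accept a bare list, an object wrapping a list under any key,
    or a flat single-fact dict, per the observed Kimi output shapes."""
    data = json.loads(raw)
    if isinstance(data, list):
        return data
    if isinstance(data, dict):
        if CANONICAL_KEYS <= data.keys():        # flat single fact
            return [data]
        for value in data.values():              # first list-valued field
            if isinstance(value, list):
                return value
    return []

shapes = [
    '[{"type": "semantic", "text": "a"}]',
    '{"analysis": "...", "extracted_memories": [{"type": "semantic", "text": "b"}]}',
    '{"type": "episodic", "text": "c", "entity": null}',
]
parsed = [parse_json_list(s) for s in shapes]
```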

Credential resolution: SYNTHETIC_API_KEY env var first (test override),
then auth.json's credential_pool["custom:synthetic"] (canonical key on
chococlaw). Older / alternate layouts (credential_pools, top-level)
also accepted for resilience.

Coercion drops malformed rows (bad type / blank text / unparseable
importance), clamps importance to 1-5, and validates entity / valid_to_hint
types. Only well-formed facts reach the caller.

Tests: 22 cases (prompt verbatim assertions, PHI blacklist (3),
parser shapes (5), coercion (3), short-circuits (2), mocked
synthetic.new full flow (5), error paths (2), auth.json round-trip).
213/213 green across all memory + scripts tests.

Live smoke test on chococlaw against real synthetic.new + Kimi K2.5:

  pleasantry  ("好的", "okay")          -> 0 facts ✓
  long-lived  ("追 sleep RCT", i.e.
               following the sleep RCT) -> 1 fact (semantic, 禮揚.研究興趣) ✓
  phi-channel ("cmio")                  -> 0 facts (short-circuit) ✓
  short-lived ("致妤 7:30")              -> 0 facts ⚠ (Kimi judges "about 致妤,
                                             not about 禮揚")

The short-lived miss is a spec-level prompt issue, not an extract.py
bug — the prompt says "memories about 禮揚" and Kimi reads that
strictly. Spec §4.1's B1 acceptance example expects this turn to
extract; matching B1 will require a spec edit (e.g. clarifying
"about 禮揚 includes 禮揚's life context"). W3-3 weekly_promotion
runs a separate thinking-mode Kimi pass over a week of episodes,
which is the spec's intended catch for hot-path misses.

Refs liyoungc/hermes-memory#8
plugins/memory/sqlite_vec/write.py implements the per-turn write-back
half of the memory system per spec §5.1.

Hot-path flow:

  1. PHI gate — channel in PHI_BLACKLIST_CHANNELS short-circuits
     extract (raw episode rows still land; the LLM never sees PHI).
  2. kimi_extract returns ExtractedFact list (or [] on failure;
     non-fatal — raw turn is still recorded so weekly_promotion can
     re-extract later).
  3. voyage_embed batches the user msg, reply, and every fact text
     in one Voyage call. Empty strings are filtered out so we don't
     waste a Voyage slot.
  4. INSERT 2 rows into episodes (user, assistant) inside a single
     BEGIN/COMMIT, with ON CONFLICT(channel, external_id) DO NOTHING
     for idempotent Discord redelivery / cron-retry / restart-replay.
  5. Per-fact partition into fast-track vs stash:
       * valid_to_hint parses to <= today + 30 days  -> INSERT
         into semantic_facts directly (the trigger mirrors into
         vec_facts so the next turn's prefetch can retrieve it).
       * everything else -> JSON-stash in episodes.metadata.stashed_facts
         for W3-3 weekly_promotion.
  6. Any exception -> rollback + append the turn (raw text, ts,
     channel, msg_id, error) to ~/.hermes/logs/memory_write_failures.jsonl.
     The reply was already sent; we never propagate the error.

Threshold rationale (spec §5.3): raised from the original 7d to 30d so
short-lived facts ("下週會去日本玩五天", "going to Japan for five days
next week") don't sit in metadata for a week before the next Sunday
review fires.

Provider wiring (plugins/memory/sqlite_vec/__init__.py):

  sync_turn() now schedules two worker-thread coroutines after the
  reply lands: bump_hits (5s budget) and write_episode (30s budget).
  The thread reuses self._lock so cross-thread sqlite3 access remains
  serialized. msg_id is synthesized by hashing
  (session_id, user, assistant, ts-to-the-minute) so Discord
  redeliveries within the same minute collapse via ON CONFLICT.
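The synthesized idempotency key might look like this (field order and separator are illustrative, not the exact implementation):

```python
import hashlib

def synth_msg_id(session_id, user, assistant, ts_iso):
    """Hash the turn with the timestamp truncated to the minute so
    Discord redeliveries within the same minute produce the same id
    and collapse via ON CONFLICT(channel, external_id)."""
    minute = ts_iso[:16]                  # "YYYY-MM-DDTHH:MM"
    payload = "\x1f".join((session_id, user, assistant, minute))
    return hashlib.sha256(payload.encode()).hexdigest()

a = synth_msg_id("chan1", "hi", "hello", "2026-05-02T13:06:17")
b = synth_msg_id("chan1", "hi", "hello", "2026-05-02T13:06:59")  # same minute
c = synth_msg_id("chan1", "hi", "hello", "2026-05-02T13:07:01")  # next minute
```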

No env-var gate (matches W2-3): activation is the same
config.yaml memory.provider: sqlite_vec. Rolling back the write path
specifically would require code change (or temporarily clearing the
provider config), but the hot-path failure mode is a JSONL log entry,
not a stalled reply, so the rollback risk is low.

Tests: 11 new (parse_valid_to_hint edge cases, fast-track threshold
edge / interior / over / null, two episode rows per turn, PHI skips
extract but records, idempotent dup msg_id, short-lived fast-tracks +
mirrors to vec_facts, long-lived stashes in metadata, mixed
partition, embed failure -> JSONL + rollback, extract failure still
records raw, empty turn no embed call). 205/205 green across all
memory + memdebug + import tests.

Live verification on chococlaw:

  Turn A: "今晚致妤大概 7:30 才到家" ("致妤 won't be home until ~7:30
          tonight") / "了解" ("got it")
    -> 2 episodes, 0 facts (Kimi judged "about 致妤 not 禮揚",
       same prompt-wording observation logged in W3-1)

  Turn B: "我下週會去日本玩五天" ("I'll be in Japan for five days next
          week") / "酷..." ("cool...")
    -> 2 episodes, 1 fact fast-tracked:
       (.家庭, "family") "下週會去日本玩五天" valid_from=2026-05-02 valid_to=2026-05-11
    -> vec_facts auto-mirrored via trigger (semantic_facts 25 -> 26).
    -> Kimi correctly inferred valid_to from "下週" (next week) + "五天" (five days).

Cleanup: smoke test data deleted from production DB before commit.

Refs liyoungc/hermes-memory#9
Implements the cold-path of the memory system per spec §5.3 + §5.4.

Two scripts (entry points in ~/.hermes/scripts/):

  scripts/weekly_promotion.py - cron Sun 03:00 UTC+8 (cron expr "0 19 * * 6"
    in UTC). Reads last 7 days of pending episodes, runs one Kimi call to
    produce a promotion diff, persists the diff to
    ~/.hermes/memories/pending_diffs/wk-YYYY-MM-DD.json, renders the digest
    markdown per spec §5.4, posts it to #memory-review via raw Discord HTTP.
    Does NOT stamp episodes.promoted_at.

  scripts/weekly_apply.py - cron Mon 03:00 UTC+8 ("0 19 * * 0" UTC).
    Purges pending_diffs/*.json older than 14 days at start. Loads the
    latest pending diff. If a <digest_id>.rejected sentinel file exists
    (written by /memreview reject in W3-4), archives the diff as rejected
    and exits. Otherwise applies promote / dedup / expire atomically and
    stamps episodes.promoted_at on the candidate rows.

Both scripts emit a final stdout line {"wakeAgent": false} so the cron
framework's wake gate skips the agent run — delivery is handled inside
the script via the Discord HTTP POST helper, no LLM round-trip needed
for the cron job itself.

Core logic lives in plugins/memory/sqlite_vec/promotion.py:
  - PROMOTION_PROMPT designed to mirror EXTRACT_PROMPT style: same
    HARD RULES (PHI blacklist, pleasantry filter, synthetic handling,
    err-on-side-of-not-promoting), four explicit actions
    (PROMOTE / DEDUP_HIT / EXPIRE / DROP_AS_NOISE), and a verbatim
    output schema.
  - Per-candidate vec_search prefilter k=20 keeps the prompt small
    (only nearest-neighbor existing facts, not the whole active set,
    so prompt stays bounded as semantic_facts grows past 500 rows).
  - WeekDigest dataclass round-trips JSON, render_digest_markdown
    matches spec §5.4 layout (Promote / Dedup / Expire / Noise sections,
    emoji icons, character-truncated chunks for Discord 2000-char limit).
  - discord_post chunks long messages on newline boundaries before 1990
    chars to stay under Discord's per-message ceiling.
  - memory_review_channel_id resolves the live channel from
    ~/.hermes/channel_directory.json (which stores platforms.discord
    as a list of {id, name, guild, type} dicts on chococlaw).
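The newline-boundary chunking described for discord_post can be sketched as below (names are illustrative; the real helper may differ):

```python
LIMIT = 1990  # stays under Discord's 2000-char per-message ceiling

def chunk_message(text, limit=LIMIT):
    """Accumulate lines until adding the next one would exceed the
    limit, then start a new chunk, so splits only land on newlines."""
    chunks, current = [], ""
    for line in text.split("\n"):
        candidate = line if not current else current + "\n" + line
        if len(candidate) > limit and current:
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

digest = "\n".join(f"- fact {i}: " + "x" * 80 for i in range(50))
parts = chunk_message(digest)
```

Splitting only at newlines means rejoining the chunks with "\n" reconstructs the digest exactly, which keeps the markdown sections intact across messages.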

Critical refactor: _apply_diff_atomic embeds promote-fact texts BEFORE
opening the BEGIN/COMMIT, then writes blobs into the transaction.
Holding the writer lock open across a Voyage HTTP round-trip would
block hot-path write_episode for the duration of the call (300ms+).
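The embed-before-BEGIN ordering, reduced to a sketch (table name and the injected embed signature are hypothetical):

```python
import sqlite3
import time

def apply_diff_atomic(db, facts, embed):
    """Run the slow embedding call BEFORE opening the transaction so
    the sqlite writer lock is never held across network I/O."""
    blobs = embed([f["text"] for f in facts])   # slow HTTP happens here
    with db:                                    # BEGIN ... COMMIT
        for fact, blob in zip(facts, blobs):
            db.execute(
                "INSERT INTO facts (text, embedding) VALUES (?, ?)",
                (fact["text"], blob),
            )

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE facts (text TEXT, embedding BLOB)")

def fake_embed(texts):
    time.sleep(0.01)                 # stand-in for the Voyage round-trip
    return [bytes(512) for _ in texts]

apply_diff_atomic(db, [{"text": "a"}, {"text": "b"}], fake_embed)
n = db.execute("SELECT COUNT(*) FROM facts").fetchone()[0]
```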

Live verification on chococlaw:

  Inserted 4 fixture episodes -> weekly_promotion -> Kimi call:
    Kimi-K2-Thinking 404'd on synthetic.new; auto-fallback to K2.5.
    Returned: 2 promote, 0 dedup, 0 expire, 1 drop_as_noise.
  weekly_apply applied diff: promoted=2 stamped=4
    semantic_facts: 25 -> 27 (then back to 25 after smoke cleanup)

  Discord post test to #memory-review (channel 1483958144596967464):
    posted=True, format renders correctly with all four sections.

Cron entries added to ~/.hermes/cron/jobs.json:
  Hermes Weekly Memory Promotion - 0 19 * * 6 (Sun 03:00 UTC+8)
  Hermes Weekly Memory Apply     - 0 19 * * 0 (Mon 03:00 UTC+8)
Both enabled, deliver=discord, script-driven (wake-gate=false).

Tests: 17 new for promotion (prompt placeholders, hard-rule presence,
candidate / neighbor formatting, digest_id format, WeekDigest round-trip,
markdown renders all 4 sections, empty-section collapse, no-candidates
short-circuit, dry-run no-write, real-run persists diff, no-pending-diff
exit, rejection sentinel archives without applying, promote inserts +
mirrors to vec_facts + stamps episodes, dedup bumps hits, expire sets
valid_to, purge_old_pending). 222/222 green across all memory + memdebug
+ import + scripts tests.

Operational notes:
- Kimi-K2-Thinking unavailable on synthetic.new (404) - we auto-fallback
  to Kimi-K2.5 with temp=0.2. Quality looks acceptable; revisit if
  promotion misses obvious dedup opportunities.
- The hot-path write_episode keeps stashing long-lived facts into
  episodes.metadata.stashed_facts, so the first real Sunday firing on
  a chocoprod week will draw from real data.

Refs liyoungc/hermes-memory#10
The hermes scheduler hard-binds ~/.hermes/scripts/ as the only exec path
for cron jobs, so the runtime copies must live there per-host. Keeping
the canonical sources in the repo means PR review can see them and a
fresh chococlaw rebuild is a 2-line cp + jobs.json patch.

Refs liyoungc/hermes-memory#10
plugins/memreview/ is a standalone slash-command plugin registering
two commands per spec §7.1:

  /memreview reject <digest_id>  - writes
    ~/.hermes/memories/pending_diffs/<digest_id>.rejected
    Monday's weekly_apply reads this sentinel and archives the diff
    without applying any of its promote / dedup / expire actions;
    candidate episodes stay unstamped for next Sunday's window.

  /memreview pending             - lists all pending digest_ids,
                                    flagging any that already carry a
                                    rejection sentinel.

  /mem off                       - global kill switch. Writes
                                    HERMES_HOME/MEM_OFF. Both
                                    SqliteVecMemoryProvider.sync_turn
                                    (hot path) and weekly_promotion
                                    (cold path) check for this file at
                                    the top of each call and short-
                                    circuit. Read path is unaffected.

  /mem on                        - removes the sentinel.

  /mem status                    - human-readable state of the kill
                                    switch + pending diff list.

Why slash commands rather than Discord reactions: spec §7.1 explicitly
chose slash because reactions don't reliably trigger webhook events
across all bot adapters — a silent kill-switch failure is worse than
no switch.

Sentinel file design rationale: file-system state (rather than in-memory
process flags) survives container restart, cross-thread visibility
without locks, and gives the user a manual recovery path
(touch / rm the file directly).
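The sentinel pattern, reduced to a sketch (the paths here are stand-ins; the real code resolves HERMES_HOME via hermes_constants):

```python
import tempfile
from pathlib import Path

HERMES_HOME = Path(tempfile.mkdtemp())   # stand-in for ~/.hermes
SENTINEL = HERMES_HOME / "MEM_OFF"

def mem_off_active() -> bool:
    """File-system kill switch: state survives container restarts and
    is visible across threads without locks."""
    return SENTINEL.exists()

def mem_off():                           # /mem off
    SENTINEL.touch()

def mem_on():                            # /mem on
    SENTINEL.unlink(missing_ok=True)     # idempotent removal

before = mem_off_active()
mem_off()
during = mem_off_active()
mem_on()
mem_on()                                 # second call is a no-op
after = mem_off_active()
```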

Wired into the write paths:
  - plugins/memory/sqlite_vec/__init__.py: sync_turn now checks
    _mem_off_active() before scheduling the write_episode worker.
    bump_hits still fires (it's read-side accounting).
  - plugins/memory/sqlite_vec/promotion.py: weekly_promotion checks
    mem_off_active() at the top of the function and returns a
    "skipped: /mem off active" summary without reading episodes,
    calling Kimi, or persisting any diff.

Both call sites import lazily from plugins.memreview so the memory
plugin still loads cleanly even if memreview is uninstalled.

Tests: 15 new (help text, pending list with/without rejected flag,
reject invalid/unknown/valid digest_id, /mem off+on creates/deletes
sentinel, /mem on idempotent, /mem status with and without pending,
register() wires both commands, end-to-end reject -> apply archives
without applying, /mem off short-circuits weekly_promotion before
Kimi is called). 522/522 green across all plugin tests.

Live verification on chococlaw:

  1. wrote fake pending diff wk-2026-05-02.json (with a "should NEVER land"
     promote entry).
  2. /memreview pending — listed it.
  3. /memreview reject wk-2026-05-02 — sentinel created, confirmation reply.
  4. weekly_apply — archived as wk-2026-05-02.rejected.json, sentinel
     auto-cleaned. semantic_facts unchanged (25 -> 25). The promote was
     correctly discarded.
  5. /mem off / status / on cycle — sentinel toggled at /opt/data/MEM_OFF.

Refs liyoungc/hermes-memory#11
Idempotent bash script that performs the W4 cutover steps when run
with --commit. Default invocation is dry-run.

Steps:
  1. Pre-flight (verify memory.db exists, recent episodes accumulated)
  2. Archive ~/.hermes/memories/MEMORY.md → MEMORY.md.archive-YYYY-MM-DD
     (chmod 444 for read-only)
  3. Confirm config.yaml memory.provider == sqlite_vec
  4. Disable legacy memory crons (Dimensions Memory Consolidation,
     Forgetting Curve) by flipping enabled=false in jobs.json
  5. Smoke test the new provider end-to-end
  6. Restart gateway

Spec target date 2026-05-24, after observing one successful weekly
review cycle. Caller is the user; script is non-destructive in dry-run
mode and refuses to overwrite existing archives so re-running mid-fail
is safe.

Rollback procedure documented in hermes-memory/docs/runbooks/memory-rollback.md §3.

Refs liyoungc/hermes-memory#12
@liyoungc liyoungc merged commit 8591ee2 into main May 2, 2026
3 of 7 checks passed
@liyoungc liyoungc deleted the feat/memory-sqlite-vec-w1 branch May 2, 2026 14:12

liyoungc (Owner, Author) commented May 2, 2026

Note (2026-05-02): The implementation introduced by this PR has been extracted to a dedicated plugin repo — see liyoungc/hermes-memory-plugin. The 25 files added here have been removed from this fork in #4. The git history of how we got here is preserved (this PR + its merge commit), but the current state of main no longer contains them. To install on a fresh hermes-agent checkout, run hermes-memory-plugin/install.sh.

liyoungc added a commit that referenced this pull request May 2, 2026
Removes the 25 implementation files merged in #2 (W1-W4-2). The code
now lives in a dedicated repo (liyoungc/hermes-memory-plugin) installed
via:

    git clone git@github.com:liyoungc/hermes-memory-plugin.git ~/Projects/hermes-memory-plugin
    ~/Projects/hermes-memory-plugin/install.sh ~/Projects/hermes-agent

The install symlinks plugins/memory/sqlite_vec, plugins/memdebug, and
plugins/memreview from the plugin repo into hermes-agent/plugins/. For
docker-based deploys, install.sh additionally writes a
docker-compose.override.yml with bind mounts so the running container
picks up live edits without an image rebuild.

Why extract:
  - git pull upstream/main on this fork is now trivial again (no merge conflicts)
  - Plugin code can be installed on a vanilla NousResearch fork
  - Spec edits and prompt iterations land in one place

DB at ~/.hermes/memories/memory.db is untouched. Cron jobs in
~/.hermes/cron/jobs.json migrate via install.sh.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
